Runtime system level fault tolerance for a distributed functional language

Authors

  • Philip W. Trinder
  • Robert F. Pointon
  • Hans-Wolfgang Loidl
Abstract

Distributed fault tolerance entails detecting errors, confining the damage caused, recovering from the errors, and providing continued service on a network of co-operating machines. Functional languages potentially offer benefits for distributed fault tolerance: many computations are pure, and hence have no side effects to be reversed during error recovery. Moreover, functional languages have a high-level runtime system (RTS) where computations and data are readily manipulated. We propose a new RTS level of fault tolerance for distributed functional languages, and outline a design for its implementation for the Glasgow distributed Haskell (GdH) language. GdH is a small extension to the Haskell language, and the fault tolerance design utilises existing distributed graph reduction mechanisms. The design distinguishes between pure and impure computations: impure, or side-effecting, computations must be recovered using conventional exception-based techniques, but the RTS attempts implicit backward recovery of pure computations.
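The distinction the abstract draws between pure and impure computations can be illustrated with a small simulation. The sketch below is written in Python rather than GdH and is not the paper's RTS design; every name in it (`NodeFailure`, `FlakyNode`, `run_pure`, `run_impure`) is hypothetical. It assumes only that a pure task can safely be evaluated again after a failure because there are no side effects to undo, whereas a failure during a side-effecting task is surfaced to the caller as an exception.

```python
class NodeFailure(Exception):
    """Simulated loss of the machine that was evaluating a task."""


class FlakyNode:
    """Hypothetical node that fails a fixed number of times, then works."""

    def __init__(self, failures):
        self.failures = failures

    def run(self, fn, *args):
        # Fail before the task starts, so no partial effects are simulated.
        if self.failures > 0:
            self.failures -= 1
            raise NodeFailure("node lost")
        return fn(*args)


def run_pure(node, fn, *args, max_retries=3):
    """Implicit recovery: re-evaluate a side-effect-free task after a failure."""
    for _ in range(max_retries + 1):
        try:
            return node.run(fn, *args)
        except NodeFailure:
            continue  # nothing to roll back, so simply try again
    raise NodeFailure("gave up after retries")


def run_impure(node, action, *args):
    """No implicit retry: the failure propagates to the caller's handler."""
    return node.run(action, *args)


def sum_of_squares(xs):
    return sum(x * x for x in xs)


# Pure task: two simulated failures are hidden from the caller.
result = run_pure(FlakyNode(2), sum_of_squares, [1, 2, 3])

# Impure task: the caller handles the failure explicitly.
log = []
try:
    run_impure(FlakyNode(1), log.append, "sent")
except NodeFailure:
    log.append("recovered by handler")
```

In this sketch the caller of `run_pure` never sees the two simulated failures, while the caller of `run_impure` must decide in its own handler how to recover.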


Similar resources

Transparent fault tolerance for scalable functional computation

Reliability is set to become a major concern on emergent large-scale architectures. While there are many parallel languages, and indeed many parallel functional languages, very few address reliability. The notable exception is the widely emulated Erlang distributed actor model that provides explicit supervision and recovery of actors with isolated state. We investigate scalable transparent faul...


A Tool for Constructing Service Replication Systems

Service replication is a key to providing high availability, fault tolerance and good performance in distributed systems. However, building a service replication system is a difficult and complex task. This paper describes a tool that mimics the design of the remote procedure call (RPC) system to support building distributed service replication systems. The tool includes an interface definition la...


Operational Semantics for Declarative Networking

Declarative Networking has been recently promoted as a high-level programming paradigm to more conveniently describe and implement systems that run in a distributed fashion over a computer network. It has already been used to implement various networked systems, e.g., network overlays, Byzantine fault tolerance protocols, and distributed hash tables. Declarative Networking relies upon a rule-ba...


Runtime Verification for Ultra-Critical Systems

Runtime verification (RV) is a natural fit for ultra-critical systems, where correctness is imperative. In ultra-critical systems, even if the software is fault-free, because of the inherent unreliability of commodity hardware and the adversity of operational environments, processing units (and their hosted software) are replicated, and fault-tolerant algorithms are used to compare the outputs....


The HiPE/x86 Erlang Compiler: System Description and Performance Evaluation

Erlang is a concurrent functional language, tailored for large-scale distributed and fault-tolerant control software. Its primary implementation is Ericsson’s Erlang/OTP system, which is based on a virtual machine interpreter. HiPE (High-Performance Erlang) adds a native code execution mode to the Erlang/OTP system. This paper describes the x86 version of HiPE, including a detailed account of d...



Publication date: 2000